Abstract

Many existing survivability mechanisms rely on software-based
system monitoring and control. Some of the software resides on
application hosts that are not necessarily trustworthy. The
integrity of these software components is therefore essential to
the reliability and trustworthiness of the survivability scheme.
In this paper we address the problem of protecting trusted
software on untrustworthy hosts by software transformations. Our
techniques include a systematic introduction of aliases in
combination with a "break-down" of the program control-flow,
transforming high-level control transfers to indirect addressing
through aliased pointers. In so doing, we transform programs to a
form that yields data-flow information very slowly and/or with
little precision. We present a theoretical result which shows
that a precise analysis of the transformed program, in the
general case, is NP-hard, and we demonstrate the applicability of
our techniques with empirical results.

1.  Introduction 

In building survivable systems, many existing mechanisms [8, 9]
rely on software-based network monitoring and management. Because
some of the software components for the survivability mechanism
will execute on hosts that are not necessarily trusted, the
reliability and trustworthiness of the survivability mechanism is
of great concern.

In this paper, we address the problem of software protection in a
potentially malicious environment. We study this problem within
the context of a survivable distributed system [9]. In this
system, software probes are deployed onto network nodes for
monitoring and control purposes. These probes are dispatched from
a set of trusted servers. Each probe may employ different
algorithms for monitoring local information and for communication
with the servers. For instance, different probes might use
different data sequences, transmit with a different protocol, or
monitor different information. To defeat this network-wide
monitoring mechanism, and thereby obtain control of the network,
an adversary must deduce either the algorithm that the probe uses
when monitoring or the protocol with which the probe communicates
with the server. Each of these attacks requires some level of
understanding of the program behavior, which can be obtained
through program analysis. This paper addresses one important
aspect of software protection: prevention of static analysis of
programs.

Static program analysis can reveal a great deal of information
about the program such as the control flow and possible uses of
data quantities at run-time [11]. This information can be used to
facilitate dynamic analysis of the program, and in some cases,
aid direct tampering with the program. In this paper, we
introduce a compiler-based approach to harden software against
static analysis. The basic approach consists of a set of code
transformations that are designed to obstruct static analysis.
The key difference between our approach and previously proposed
code-obfuscation techniques [4, 5, 7] is that our techniques are
supported by both theoretical and empirical complexity measures.
Without the complexity measures, code-obfuscation techniques are
at best ad hoc.

The problem of software protection has been investigated in other
studies. The notable ones include Intel’s IVK project [2],
Collberg’s code obfuscation work [4, 5] and mobile cryptography
[20]. The IVK work coined the phrase Tamper Resistant Software.
Their technique was novel but came at the price of considerable
run-time cost. The mobile cryptography study proposed a technique
to execute programs in an encrypted form. In its present form,
the technique has limited applicability (e.g., rational
functions).

The approach described in this paper is based on well-understood
programming language principles, which serve as the basis for the
complexity measures. We structure the paper as follows: In
Section 2, we present the system model and assumptions on which
this work is based. Section 3 describes the basics of static
analysis. Sections 4 and 5 present the transformations to hinder
control-flow and data-flow analysis. Section 6 discusses
theoretical and practical foundations of the proposed scheme.
Section 7 presents our implementation and experimental results.
2.  The  system  model

In this section we describe the assumptions and the system model
to set the context for discussion. Our system consists of a set
of computing hosts connected via a network and a set of
communicating processes running on these hosts. The hosts are
divided into two categories: application hosts and survivability
control hosts. The processes relevant to the survivability
control mission are the control processes running on the control
hosts and the probe programs running on the application hosts.
The probes are responsible for local monitoring and
reconfiguration. The control processes collect monitoring
information from the probes, conduct network-wide analysis, and
issue reconfiguration commands to the probes if real-time changes
are deemed necessary. An overview of the system architecture is
depicted in Figure 1.

Several characteristics and assumptions about the system are
important for the discussion. They are listed below:

•  Trusted control servers: In our system, the control servers
and the control processes running on top of them are presumed
trusted.

•  Trusted  network  communications:  We  assume  the 
network  communications  between  the  control  processes  and 
the  software  probes  are  trusted. 

•  Diversity in the probing mechanism: In this system, the
probing mechanism makes use of two forms of diversity that are
essential to the approach detailed in later sections. They are
temporal diversity and spatial diversity. Temporal diversity
takes the form of periodic replacement of the probes with a new
version dispatched from the trusted control servers. Spatial
diversity refers to the installation of different versions of
probes across the network. Each version of the probes may use a
different probing algorithm, a different protocol to communicate
with the control server, and may appear to have a different
operational semantics [21]. The use of diversity here makes it
essential for an adversary to learn the program algorithm in
order to launch an intelligent tampering or impersonation attack.

Figure 1: A controlled network

•  High level of interactions between the probes and the control
processes: While executing on a remote host, the probes maintain
a high level of interaction with their control servers via
predetermined protocols. It is assumed that the probes perform
integrity checks whose results are verified by the control
servers. The checking mechanisms are also installation-unique in
that each probe program may employ a set of different checks. It
is not the task of this paper to devise the checking mechanisms.
It suffices to state that the integrity checks may be performed
on the software itself as well as its executing environment. The
result of the integrity checks establishes the basis of the
probe’s identity and authenticity.

In  this  work,  we  are  primarily  interested  in  defending 
against  sophisticated  attacks  that  fall  under  the  category  of 
intelligent  tampering  and  impersonation  attacks. 

•  Intelligent  Tampering.  Intelligent  tampering  refers  to 
scenarios  in  which  an  adversary  modifies  the  program  or  data 
in  some  specific  way  that  allows  the  program  to  continue  to 
operate  in  a  seemingly  unaffected  manner  (from  the  trusted 
server’s  point  of  view),  but  on  corrupted  state  or  data. 

•  Impersonation. An impersonation attack is similar to
intelligent tampering in that the attacker seeks to establish a
rogue version of the legitimate program. The difference is that
impersonation attempts to emulate the behavior of the original
program, while intelligent tampering modifies the program or its
data directly.

It should be noted that denial-of-execution attacks are not
considered here. In this problem context, denial-of-execution
produces straightforward symptoms that can be readily identified
by the trusted server (e.g. loss of communication). Unlike
denial-of-execution, an intelligent tampering or impersonation
attack may not be immediately obvious; if the attacker has
detailed knowledge of what the software is supposed to do and the
appropriate privilege to instantiate a malicious copy, he can
replace the original program and make the replacement virtually
undetectable. Such attacks therefore have the potential to
inflict substantial harm: the adversary could manipulate the
program to perform seemingly valid but malicious tasks.

With the diversity scheme and the integrity checks in place, a
successful intelligent tampering or impersonation attack requires
knowledge about the probe algorithm and the communication
protocol in order to bypass or defeat the checking mechanism.
This in turn requires information about the program semantics,
and it is this information that we endeavor to protect. For
example, consider the following code segment:
int a = function1();
int b = function2();
Check_for_intrusion(&a, &b);
p = &a;
integrity_check(p);

If an adversary were to tamper with the Check_for_intrusion()
function, he or she would need to understand whether and how
Check_for_intrusion() changes the values of a and b, and whether
a or b will be used later in the program. Without this knowledge,
the tampering can be revealed when integrity_check(p) is called.

Our premise is that an adversary aiming to tamper with or
impersonate the program in an intelligent way must understand the
effect of his action, and this boils down to an understanding of
the program semantics. One way this understanding can be acquired
is through program analysis. This paper deals with obstruction of
program analysis, in particular, static analysis of programs. Our
approach consists of a framework of code transformations designed
to increase the difficulty of static program analysis, which is
described in the remaining sections.

3.  Static  analysis  of  programs 

Static analysis refers to techniques designed to extract
information from a static image of a computer program. Static
analysis is often more efficient than analyses performed
dynamically, such as simulated execution.

From the software-protection point of view, static analysis could
yield useful information for targeted manipulation of software.
Consider again the code example in the last section. A use-def
analysis [11] of the code segment would quickly reveal that a
possible definition of the data quantity a in function
Check_for_intrusion() will be propagated to a use (through its
alias p) in function integrity_check(). Based on this knowledge,
an adversary could then perform specific modifications to
Check_for_intrusion() so long as he leaves the semantics of a
intact for its later use.

Static analysis can be conducted in a manner that is either
sensitive or insensitive to the program control-flow.
Flow-insensitive analysis is generally more efficient at the
price of being less precise [11, 14].

A flow-sensitive analysis first constructs the Control-Flow Graph
(CFG) of the program. Such a graph consists of nodes which are
basic blocks and edges that indicate control transfers between
blocks. The analysis then proceeds to solve the data-flow problem
based on the CFG.

It is important to note that control-flow analysis is the first
stage of the analysis; it provides information on the program
call structure and control transfer that is essential for
subsequent data-flow analysis. Without this information,
data-flow analysis is restricted to the basic-block level only
and will be fundamentally ineffective for programs where data
usage is dependent on program control-flow.

The  technical  basis  of  our  approach  to  defeating  static 
analysis  is  to  transform  the  program  control-flow  to  a 
highly  data-dependent  nature;  that  is,  the  control-flow  and 
data-flow  analysis  are  made  co-dependent.  The  results  of 
this  co-dependence  are:  (1)  increased  complexity  of  both 
analyses;  and  (2)  reduced  analysis  precision. 

4.  Degeneration  of  control-flow 

In a normal program, determining the CFG is a straightforward
operation when branch instructions and targets are easily
identifiable; it is a linear operation of complexity O(n), where
n is the number of basic blocks in the program.

The first set of code transformations that we employ modifies
high-level control transfers to obstruct static detection of the
program CFG. We perform this transformation in two steps. In the
first step, high-level control structures are transformed into
equivalent if-then-goto constructs. This transform is illustrated
in Figure 2, in which the sample program in Figure 2(a) is
transformed into the structure in Figure 2(b).

Secondly, we modify the goto statements such that the goto target
addresses are determined dynamically. In C, we implement this by
replacing the goto statements with an entry to a switch
statement, and the switch variable is computed dynamically to
determine which block is to be executed next. The transformed
code (based on the code segment of Figure 2(a)) is depicted in
Figure 3.

With the above transformations, direct branches are replaced with
data-dependent instructions. As a result, the CFG that can be
obtained from static branch targets degenerates to a flattened
form shown in Figure 3. It can be shown that this degenerate form
is equivalent to the control-flow perceived by a flow-insensitive
analysis [14]. Without knowledge of the branch targets and the
execution order of the code blocks, every block is potentially
the immediate predecessor and/or successor of every other block.

int a, b;
a = 1;
b = 2;
while (a < 10) {
    b = a + b;
    if (b > 10)
        b--;
    a++;
}
use(b);

(a)

Figure 2: Dismantling high-level constructs

Figure 3: Transforming to indirect control transfers

In the absence of the branch-target information, the complexity
of building the static CFG is determined by how easy it is, at
each branching point, to discern the latest definition of the
switch variable. This is exactly a classical use-n-def data-flow
problem [11]. Note that we have transformed the control-flow
analysis into a data-flow problem. The complexity of data-flow
analyses is influenced by various program characteristics such as
aliasing [11]. We show in the next section how manipulation of
data-flow characteristics can yield additional complexity for
data-flow analysis and ultimately render static analysis an
extremely difficult problem, if not entirely infeasible.

5.  Data-flow  transformations 

After the transformations described in Section 4, the complexity
of building the CFG now hinges on the complexity of determining
branch targets, which is in essence a use-n-def data-flow
problem. Many classical data-flow problems are proven to be
NP-complete [12, 16]. A fundamental difficulty that data-flow
analysis must deal with is the existence of aliases in the
program. Alias detection is essential to many data-flow problems.
For example, in order to solve the live-definition problem, a
data-flow algorithm must understand the alias relationships among
variables, since data quantities can be modified when assignments
are performed on any aliased names.

Our second set of transformations focuses on the introduction of
non-trivial aliases into the program to influence the computation
and the analysis of the branch targets. These transformations
include the following techniques:

Index computation of branch targets: Consider the code segment in
Figure 4(a). A use-n-def analysis to determine where the switch
variable swVar (which contains branch target information) is
defined is straightforward (the dashed line indicates a use-def
information chain). Now consider the code segment in Figure 4(b),
in which a global array "global_array" is introduced and the
value of swVar is computed through the elements of the array
(f1() and f2() indicate complex expressions of subscript
calculation). Replacing the constant assignment in Figure 4(a)
with indirect accesses of the array implies that the static
analyzer must deduce the array values before the value of swVar
can be determined.

Aliases  through  pointer  manipulation:  We  introduce 
aliases  in  the  following  steps: 

•  In  each  function,  we  introduce  an  arbitrary  number  of 
pointer  variables.  We  insert  artificial  basic  blocks,  or  code  in 
existing  blocks,  that  assign  the  pointers  to  existing  data 
variables  including  elements  of  the  global  array. 

•  We replace references to variables and array elements with
indirection through these pointers. Previously meaningful
computations on data quantities are replaced with semantically
equivalent computation through their aliased names (assignments
to the global_array elements may appear as assignments to a
pointer variable).

•  As  much  as  possible,  uses  of  the  pointers  and  their 
definitions  are  placed  in  different  blocks.  This  is  to  introduce 
difficulties  for  the  use-n-def  analysis. 

Some  of  the  basic  blocks  will  execute,  and  others  are 
simply  dead  code.  Since  the  static  analyzer  does  not  know 
which  blocks  actually  execute,  and  since  definition  of  the 
pointers  and  their  uses  are  placed  in  different  blocks,  the 
analyzer  will  not  be  able  to  deduce  which  definition  is  live 
at  each  use  of  the  pointer — all  pointer  assignments  will 
appear  live. 

Figure 4: Example illustrating dynamic computation of branch targets

For example, a static analysis performed on the code segment in
Figure 5(a) can quickly determine that only the second definition
of the pointer variable p will carry to point A during execution.
However, if the basic block in Figure 5(a) is decomposed into two
blocks and the transition between blocks is obfuscated using our
flatten-and-jump technique as depicted in Figure 5(b), the static
analyzer will report both alias relations <*p, a> and <*p, b>
because it does not know which block executes first.

Figure 6 illustrates example transformations as applied to the
program in Figure 2(a). The result of the transforms is the
following: a static analyzer will report imprecise alias
relations that suggest that the global array is altered, and that
its contents do not remain static. With sufficient alias
introduction, the analysis will resolve an array element to a
large set of possible values. This in turn implies that, at each
use, the switch variable that controls the flow of execution in
the degenerate form of the program can take on a large set of
values.

It can be argued that if an adversary can capture the initial
value of swVar, he can then find the first block to be executed,
and from there identify each subsequent block. While this may
allow the adversary to recover some of the original control flow,
it is important to note that this analysis requires an
interpretation of every preceding block in order to recover the
current basic block, an effort that exceeds the cost of most
static analyses.

Figure 5: Introducing aliases through pointers

It can also be argued that simulation is required only once for
each block, and as a result, the complexity of analyzing such a
program lies somewhere between static analysis and a full
execution trace, with analysis time proportional to the number of
blocks in the program. One way to defeat this analysis is to
unroll loops and introduce semantically equivalent basic blocks
that will be chosen randomly during execution. Consequently, the
effort required in recovering the program control-flow will be
comparable to a full simulation. In addition, the initial
computation of swVar can be erased from memory once it is used to
avoid unnecessary exposure of information.

6.  Complexity  evaluation 

We have thus far conjectured that the difficulty of discerning
indirect branch target addresses is influenced by aliases in the
program. In this section, we support this claim by presenting a
proof in which we show that determining precise indirect branch
addresses statically is an NP-hard problem in the presence of
general pointers.

6.1.  An  NP-hardness  argument

Theorem  1:  In  the  presence  of  general  pointers,  the 
problem  of  determining  precise  indirect  branch  target 
addresses  is  NP-hard. 

Proof: Our proof consists of a polynomial-time reduction from the
3-SAT problem to that of determining precise indirect branch
targets. This is a variation of the proof originally proposed by
Myers, in which he proved that various data-flow problems are
NP-hard in the presence of aliases [12]. Landi later proposed a
similar proof to show that alias detection is NP-hard in the
presence of general pointers [16]. The detailed reduction can be
found in other extended documents [21, 22].

The NP-hardness proof establishes that the problem of statically
determining branch target addresses is NP-hard in the presence of
general pointers. This result applies to the set of general
programs (with general pointers), which, at first glance, may not
be the same as the set of programs produced by our
transformations. We must further establish that the set of
transformed programs does not represent a restricted class of
programs and that the proof also applies. We approach this as
follows.

Let A be the set of general programs and A’ the set of programs
produced by our transformations. To show that A’ is not a
restricted subset of A (with respect to the NP-hardness proof),
it suffices to show that

1)  There  is  a  polynomial  time  mapping  for  every 
instance  a  in  A  to  a  functionally  equivalent  instance  in  A’. 

2)  If there is a polynomial-time algorithm to resolve indirect
branch targets for any instance in A’, then this algorithm can be
used to resolve indirect branch targets for instances of A.

Establishing a polynomial-time mapping from instances of A to
instances of A’ is straightforward; this mapping consists of
exactly the code transformations we described in Sections 4 and
5.

Because the transformations introduced in Sections 4 and 5 are
semantics-preserving transformations, an algorithm that resolves
indirect branch targets for an instance in A’ will by definition
resolve indirect branch targets for its functionally equivalent
instance in A. More intuitively, if all the indirect branching
targets for an instance in A’ are resolved to direct jumps, it is
a polynomial-time task to restore the original control constructs
(from the flattened if-then-goto constructs) and therefore deduce
the branch targets for the original program in A.

The reduction from 3-SAT does not make use of any program
characteristics other than multiple levels of pointer
dereferencing and conditional branches. The transformations
described in Sections 4 and 5 preserve the presence of
conditional branches and arbitrary levels of pointers and pointer
dereferencing. From an intuitive standpoint, this suggests that
the reduction from 3-SAT also stands for the transformed program.

6.2.  Complexity  evaluation  for  approximation 
analysis  methods 

While the NP-hardness result bodes well for the alias-based code
transformations, we still need to evaluate our approach against
possible heuristics and approximation methods. In this section,
we explore the effect of two analysis methods: brute-force search
and alias approximations.

Brute-force search method. To determine the execution order of
the code blocks that appear in the degenerate form of the
program, an adversary might employ a brute-force search method in
which all combinations of the code block ordering are explored.
This is a naive exhaustive search heuristic in which each block
is considered equally likely to be the immediate successor of the
current block (including the current block itself). The time
complexity of such a brute-force method is O(n^k), where n is the
number of distinct program blocks and k is the number of blocks
that will be executed. Clearly, this represents the worst-case
time complexity and is extremely inefficient when the values of n
and k are sufficiently large.

Alias-detection approximation methods. The problem of precise
alias detection in the presence of general pointers and recursive
data structures is undecidable [11, 16]. In practice, however,
approximation algorithms are often used [11]. An alias analysis
algorithm may analyze aliases intra-procedurally as well as
across procedural boundaries.

Figure 6: An example transform using pointer manipulation

Intra-procedural alias analysis requires as input the alias set
holding at the entry node of the procedure, the alias set
propagated back from any procedure called within the current
procedure, and the alias processing functions (transfer
functions) of each pointer assignment statement. Well-known
data-flow frameworks [11, 17] exist for handling intra-procedural
alias analysis. They are divided into flow-sensitive and
flow-insensitive methods. Flow-sensitive methods make use of
control-flow path information, and are more precise than
flow-insensitive methods. The transformations described in
Sections 4 and 5 produce a degenerate form of static control
flow. As a result, flow-sensitive analysis conducted on this form
of control flow loses the precision advantage it has over the
flow-insensitive methods. Figure 7 illustrates such an example.

The CFG in Figure 7 shows that the assignment q = &c overwrites
the alias relation <*q, b>, and p = &b overwrites <*p, a>. A
flow-sensitive alias analysis algorithm, conducted on the control
flow in Figure 7(a), would result in an alias set [<*a, c>,
<*p, d>, <*q, c>] for this segment. The degenerate control flow
in Figure 7(b) essentially represents the set of all possible
paths with these blocks. Even a flow-sensitive analysis algorithm
at best must conclude with the alias set [<*a, c>, <*p, a>,
<*p, d>, <*q, c>, <*q, b>]. Horwitz [14] presented a definition
of precise flow-insensitive alias analysis. Under this
definition, the flow-sensitive analysis result obtained from the
CFG in Figure 7(b) is exactly the same result as a precise
flow-insensitive algorithm would have concluded with the CFG in
Figure 7(a). We thus conjecture that, with the degenerate form of
control flow, a flow-sensitive analysis can be no more precise
than its flow-insensitive counterpart.

Figure 7: Effect of the degenerate control flow on alias analysis

The transformations presented in this paper are intra-procedural
transforms in the sense that they do not affect a control-flow
analysis on the procedural level. However, an inter-procedural
alias analysis is inherently based on the result of
intra-procedural alias analysis; therefore its precision suffers
likewise. A step beyond the current scheme is to generalize the
transformations to produce degenerate PCGs, which will further
degrade analysis results. A detailed discussion on this topic as
well as an in-depth study of the complexity of existing alias
analysis frameworks can be found in [21].

7.  Implementation  and  Performance  Results

We implemented the transformations in a source translator for
ANSI C in the SUIF programming environment [1]. In our
implementation, we developed compiler passes for the code
transformations. Each pass traverses the SUIF representation and
performs the desired modifications. The exact transformations are
determined by a random seed; that is, the resulting program is
different for each compilation. For example, the layout of the
global array, the exact percentage of the control transfers that
are transformed, and the number of dead blocks that are added are
all determined by a random number generated from the seed.

We obtained performance results by applying experimental
transformations to the SPEC95 benchmark programs. Of issue here
are three measures: run-time performance of the transformed
program, performance of static analysis, and precision of static
analysis.

By run-time performance of the transformed programs, we mean the
execution time and the executable object size after
transformation. These measures reflect the cost of the
transformation. By performance of static analysis, we mean the
time taken for the analysis tool to reach closure and terminate.
A related but equally important criterion is the precision of
static analysis, which indicates how accurate the analysis result
is compared to the true alias relationships.

7.1.  Performance  of  the  transformed  program

The following data was obtained by applying our transforms to
SPEC95 benchmark programs. Three SPEC programs are used in this
experiment: Compress95, Go and LI.
Go is a branch-intensive implementation of the Chinese
board game Go. Compress95 implements a tightly looping
compression algorithm, and LI, a LISP interpreter, is a
typical input-output-bound program. These programs are
standard benchmarks used in the compiler community.
They embody three major classes of high-level language
constructs that are widely used in general programming. It
would be more satisfying, however, to test our results on
the class of networking programs for which this solution
was intended. In the absence of such programs, we believe that
these test programs are good representatives of real-world
programs.

We conducted experiments on both optimized (with the
gcc -O option) and non-optimized versions of the programs.
The experiments were executed on a SPARC server. The
experimental results show that, in both cases, the performance
slowdown increases exponentially with the percentage
of transformed branches in the program. On average,
the performance speedup due to optimization is significantly
reduced when a more substantial portion of the program
is obfuscated.

This is an encouraging result; it is highly suggestive
(albeit not conclusive) that our transformations considerably
hindered the optimization that the compiler is able to
perform.


The performance of Go and LI was similar for both
optimized and non-optimized code. Of the three original
programs, compiler optimization performed best on Compress,
yielding an 80% decrease in the execution time.
However, as can be seen in Figure 9, our
transforms removed significant optimization potential from
Compress; the execution speed of the transformed and optimized
Compress diverges most significantly from the performance
of the original optimized program. As Compress
is a loop-intensive program, it is likely that certain analyses
that enabled significant loop or loop kernel optimization
were no longer possible after our transform was performed.

The object size of the three benchmarks grew with
increased branch replacement (see Figure 10 and Figure
11). Go, a branch-intensive program, shows the largest code
growth with our transform. For 80% replacement of direct
branches, the executable size increased by a factor of 3 for
Go and LI, and by roughly 10% for Compress. Compress
contains relatively fewer static branches, and this resulted
in less potential for code growth with the transform.

We believe that these results are representative of many
programs. It appears that, on average, replacing 50% of the
branches will increase the execution time of the program
by a factor of 4. At the same time, the program
will nearly double in size.

In these experiments, we chose the branches to transform
at random. An obvious future improvement is to select
branches intelligently: a) identify the regions of the
program that require greater protection from static
analysis, and b) selectively transform the less-often-executed
branches to reduce the performance penalty. Trade-offs
between these two criteria need to be considered for the
most effective solution.
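
As an illustration of seed-driven branch selection, the sketch below
marks a fixed percentage of branches for transformation using a seeded
pseudo-random generator. It is a minimal sketch under stated
assumptions, not the SUIF pass used in our implementation: the names
select_branches and next_random, the generator constants, and the
representation of branches as array indices are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a small linear congruential generator, so the
 * selection depends only on the seed and not on the platform's rand(). */
static uint32_t next_random(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 8;   /* discard the weaker low-order bits */
}

/* Marks `percent` percent of `n` branches for transformation.
 * selected[i] becomes 1 if branch i is to be replaced, otherwise 0.
 * Returns the number of branches marked. */
size_t select_branches(unsigned char *selected, size_t n,
                       unsigned percent, uint32_t seed)
{
    size_t want, chosen = 0, i;
    uint32_t state = seed;

    assert(percent <= 100);
    want = n * percent / 100;   /* exact number of branches to mark */

    for (i = 0; i < n; i++) {
        size_t remaining = n - i;
        /* Selection sampling: mark branch i with probability
         * (want - chosen) / remaining, giving exactly `want` marks. */
        if (next_random(&state) % remaining < want - chosen) {
            selected[i] = 1;
            chosen++;
        } else {
            selected[i] = 0;
        }
    }
    return chosen;
}
```

Calling select_branches twice with the same seed marks the same
branches, while a different seed will in general mark a different
subset. A profile-guided variant, as suggested in b) above, could bias
the draw toward branches with low execution counts; that variant is
not shown here.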

7.2.  Performance  and  precision  of  static  analysis 

In this experiment, we test our techniques against existing
analysis tools and algorithms. The state-of-the-art analysis
tools include the NPIC tool [13] and the PAF toolkit
[18]. They both implement an inter-procedural, flow-sensitive
algorithm. Both NPIC and PAF perform control-flow
analysis exactly once with no further refinement on the
flow graph.

In our experiments, PAF successfully analyzed small
sample programs (run to completion) but failed to handle
some of the large programs included in the SPEC benchmarks.
The failure characteristics were inconclusive as to
whether the analysis failed due to difficulties incurred in the
alias analysis or an inability to handle the size of the original
input program. The test cases that we successfully
completed with PAF included a wide range of sample programs
that contain extensive looping constructs and branching
statements. In each of these test cases, PAF terminated
reporting the largest possible number of aliases in the program;
in other words, it reported that any pointer variable is
possibly aliased to every variable that ever appeared on the
left hand side of an assignment statement. Because of the
small size of the test programs, we observed negligible
differences in the pre- and post-transformation analysis time. The
experience with the PAF tool, albeit with limited test cases,
indicated that PAF failed to resolve aliases across the flattened
basic blocks, and that our technique of making data-flow
and control-flow co-dependent presents a fundamental
difficulty that existing analysis algorithms lack the sophistication
to handle.

NPIC implements a slightly more aggressive algorithm
that includes features such as function-pointer analysis. It
performs an iterative analysis interleaving the inter- and
intra-procedural analysis. Every time new aliasing information
is generated by an intra-procedural phase, it is propagated
to its successor functions which then repeat their
intra-procedural analysis, and so on, until the alias set
converges. Unfortunately, IBM no longer maintains or
distributes the tool. The experience with NPIC was therefore
limited to analytical experiments with the NPIC algorithm.
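
The iterate-until-convergence structure described above can be
sketched as a simple fixed-point computation. The code below is not
the NPIC algorithm; it is a generic, flow-insensitive points-to solver
written for illustration, and the names solve_points_to and struct stmt,
as well as the bitmask encoding of alias sets, are assumptions. It
shows only that each pass can add alias relations but never removes
any.

```c
#include <assert.h>
#include <stddef.h>

enum stmt_kind { ADDR_OF, COPY };   /* p = &x   or   p = q */

struct stmt {
    enum stmt_kind kind;
    int lhs;   /* index of the pointer being assigned */
    int rhs;   /* index of the variable on the right-hand side */
};

/* pts[v] is a bitmask: bit x set means "v may point to variable x".
 * Repeats passes over all statements until no set changes.
 * Returns the number of passes, including the final unchanged one. */
int solve_points_to(const struct stmt *s, size_t n,
                    unsigned *pts, size_t nvars)
{
    int passes = 0, changed = 1;
    size_t i;

    assert(nvars <= 16);   /* unsigned has at least 16 bits */
    for (i = 0; i < nvars; i++)
        pts[i] = 0;

    while (changed) {
        changed = 0;
        passes++;
        for (i = 0; i < n; i++) {
            unsigned before = pts[s[i].lhs];
            if (s[i].kind == ADDR_OF)
                pts[s[i].lhs] |= 1u << s[i].rhs;   /* add one target */
            else
                pts[s[i].lhs] |= pts[s[i].rhs];    /* inherit targets */
            if (pts[s[i].lhs] != before)
                changed = 1;   /* sets only ever grow */
        }
    }
    return passes;
}
```

For the statements q = p; p = &a; p = &b; q = &c (with p, q, a, b, c
numbered 0 to 4), the solver needs three passes: the copy q = p picks
up the targets of p only on the second pass, and the third pass
confirms that nothing changes.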

A limited number of experiments with the NPIC algorithm
were conducted on small programs. These experiments,
to the extent that a semi-automated analysis would
allow, revealed that little accuracy had been achieved by
the time the analysis terminated.

In a particular instance where index computation and
aliasing were used to compute branch targets, NPIC started
out indicating that the elements of the global array could
contain a number of possible values. As the iterations went
on, this information was never refined. Rather, alias relations
identified in later iterations increased the set of possible
values that the array elements were deemed to have.
The algorithm eventually terminated and claimed that the
elements of the global array were changed an arbitrary
number of times, and that they could contain arbitrary values.
Computations involving the array elements were
deemed unanalyzable. This in turn implied that the indirect
branching targets could not be determined precisely. Alias
information propagation among those blocks therefore did
not get easier and alias relations were never refined.
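
To make this situation concrete, the following is a minimal
hand-written sketch of a loop whose branch targets are stored in a
global array and read back through aliased pointers. It is not output
of our compiler: the function names, the array g and its layout, and
the block numbering are hypothetical, and a portable switch dispatcher
stands in for the indirect branch.

```c
#include <assert.h>

/* Original form: high-level control flow. */
int sum_to(int n)
{
    int s = 0, i;
    for (i = 1; i <= n; i++)
        s += i;
    return s;
}

/* Hypothetical global array whose elements hold block numbers. */
static int g[8];

/* Flattened form: every transfer goes through the dispatcher, and the
 * next block number is written and read through aliased pointers. */
int sum_to_flat(int n)
{
    int s = 0, i = 1;
    int *p = &g[3];      /* p and q alias the same array element */
    int *q = &g[3];
    int *next = &g[0];   /* dispatcher reads its target through next */

    g[0] = 1;            /* enter at block 1 */
    for (;;) {
        switch (*next) {
        case 1:                       /* loop test */
            *p = (i <= n) ? 2 : 3;    /* target written through p ... */
            next = q;                 /* ... and read through q */
            break;
        case 2:                       /* loop body */
            s += i;
            i++;
            g[5] = 1;                 /* back to the loop test */
            next = &g[5];
            break;
        case 3:                       /* exit block */
            return s;
        default:                      /* dead block, never reached */
            return -1;
        }
    }
}
```

To recover the loop structure, an analyzer must know which values each
element of g can hold at each dispatch, which in turn requires knowing
that p and q refer to the same element. If the alias analysis concludes
that the elements of g may hold arbitrary values, every block appears
to be a possible successor of every other block.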

8.  Conclusion 

The problem of protecting trusted software from
untrustworthy hosts is important for many critical functions
in modern networks. Consider, as an example, distributed
intrusion detection systems in which parts of the ID
programs need to operate on untrustworthy hosts. Serious
consequences would arise if these programs were the targets
of malicious attacks and were compromised.

In this paper, we considered one significant class of
attacks, namely those based on static analysis of the binary
form of the program. We presented a strategy for defeating
analysis by tightly coupling the control flow and the data
flow of the program. Since data-flow analysis of acceptable
precision is dependent on the control-flow information, this
approach is capable of expanding analysis time considerably
and reducing the precision of the analysis to useless
levels. The theoretical bound that we have established
shows that analysis of programs that have been transformed
in this manner is NP-hard.

We have developed a practical instantiation of the
transformation in the form of a compiler for ANSI C. The
compiler makes a number of changes to the program source
including: degeneration of the program control flow; the
systematic and general creation of aliases; and the introduction
of data-dependent branches. We note that these transformations
are not dependent on a C-like pointer
paradigm; they can be applied to any intermediate representation
where explicit memory references exist.

In proof-of-concept experiments that we have conducted
on sample programs, the transformed versions
defeat currently available static-analysis tools. Although
such experiments are not and could never be definitive evidence,
we regard these results as promising indications that
we have a practical approach to defeat static analysis.

We note that the described transformations produce
programs with a considerable level of code diversity (transforms
are randomly chosen on a per-compilation basis).
Such programs, when deployed at various points in a network,
are highly resilient to class attacks since most class
attacks exploit common software flaws.

It is important to note that the purpose of this work is to
eliminate the possibility that a static analysis can be used to
deduce useful information for software tampering or impersonation.
In other words, the optimal result is that there
should be no efficient way to analyze the program other
than an actual execution. We also note that many forms of
dynamic program analysis make use of static information
[3, 10], and the techniques described in this paper will be
helpful in defending against these forms of analyses.